Functional Association Networks for Disease Gene Prediction

Author

  • Dimitri Guala
Abstract

Mapping of the human genome has been instrumental in understanding diseases caused by changes in single genes. However, disease mechanisms involving multiple genes have proven to be much more elusive. Their complexity emerges from interactions of intracellular molecules and makes them immune to the traditional reductionist approach. Only by modelling this complex interaction pattern using networks is it possible to understand the emergent properties that give rise to diseases. The overarching term used to describe both physical and indirect interactions involved in the same functions is functional association. FunCoup is one of the most comprehensive networks of functional association. It uses a naïve Bayesian approach to integrate high-throughput experimental evidence of intracellular interactions in humans and multiple model organisms. In the first update, both the coverage and the quality of the interactions were increased, and a feature for comparing interactions across species was added. The latest update involved a complete overhaul of all data sources, including a refinement of the training data and the addition of new classes and sources of interactions as well as six new species. Disease-specific changes in genes can be identified using high-throughput genome-wide studies of patients and healthy individuals. To understand the underlying mechanisms that produce these changes, they can be mapped to collections of genes with known functions, such as pathways. BinoX was developed to map altered genes to pathways using the topology of FunCoup. This approach, combined with a new random model for comparison, enables BinoX to outperform traditional gene-overlap-based methods and other network-based techniques. Results from high-throughput experiments are challenged by noise and biases, resulting in many false positives. Statistical attempts to correct for these challenges have led to a reduction in coverage.
Both limitations can be remedied using prioritisation tools such as MaxLink, which ranks genes using guilt by association in the context of a functional association network. MaxLink’s algorithm was generalised to work with any disease phenotype and its statistical foundation was strengthened. MaxLink’s predictions were validated experimentally using FRET. The availability of prioritisation tools without an appropriate way to compare them makes it difficult to select the correct tool for a problem domain. A benchmark was developed to assess the performance of prioritisation tools in terms of their ability to generalise to new data. FunCoup was used for prioritisation, while testing was done using cross-validation of terms derived from Gene Ontology. This resulted in a robust and unbiased benchmark for evaluation of current and future prioritisation tools. Surprisingly, previously superior tools based on global network structure were shown to be inferior to a local network-based tool when performance was analysed on the most relevant part of the output, i.e. the top-ranked genes. This thesis demonstrates how a network that models the intricate biology of the cell can contribute valuable insights to researchers who study diseases with complex genetic origins. The developed tools will help the research community to understand the underlying causes of such diseases and discover new treatment targets. The robust way to benchmark such tools will help researchers to select the proper tool for their problem domain.
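
The guilt-by-association idea mentioned in the abstract can be illustrated with a minimal sketch. This is not MaxLink's actual algorithm or its statistical model, which the abstract does not describe in detail. It only assumes that a candidate gene is scored by the number of known disease (seed) genes it is directly linked to in a network. The function name `rank_by_seed_links` and the gene labels are hypothetical.

```python
def rank_by_seed_links(edges, seeds):
    """Rank non-seed genes by how many distinct seed genes they link to.

    edges: iterable of (gene_a, gene_b) pairs from an undirected network.
    seeds: genes already associated with the disease of interest.
    Returns (gene, score) pairs, highest score first, ties broken by name.
    """
    seeds = set(seeds)

    # Build an undirected adjacency map from the edge list.
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    # Score each candidate by its number of direct links to seed genes.
    scores = {}
    for gene, linked in neighbours.items():
        if gene in seeds:
            continue
        n_seed_links = len(linked & seeds)
        if n_seed_links > 0:
            scores[gene] = n_seed_links

    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy network with hypothetical gene labels (S* are seeds, C* candidates).
toy_edges = [
    ("S1", "C1"), ("S2", "C1"), ("S3", "C1"),
    ("S1", "C2"), ("S2", "C2"),
    ("C2", "C3"),
    ("S3", "C4"),
]
toy_seeds = ["S1", "S2", "S3"]

ranking = rank_by_seed_links(toy_edges, toy_seeds)
print(ranking)
```

In this toy network, C1 ranks first because it is linked to all three seeds, while C3 is not ranked at all because it has no direct seed link. A real prioritisation tool would also need to account for link confidence and for highly connected genes that reach many seeds by chance.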

Similar resources

Ligand Similarity Complements Sequence, Physical Interaction, and Co-Expression for Gene Function Prediction

The expansion of protein-ligand annotation databases has enabled large-scale networking of proteins by ligand similarity. These ligand-based protein networks, which implicitly predict the ability of neighboring proteins to bind related ligands, may complement biologically-oriented gene networks, which are used to predict functional or disease relevance. To quantify the degree to which such liga...

Gene Function Prediction from Functional Association Networks Using Kernel Partial Least Squares Regression

With the growing availability of large-scale biological datasets, automated methods of extracting functionally meaningful information from this data are becoming increasingly important. Data relating to functional association between genes or proteins, such as co-expression or functional association, is often represented in terms of gene or protein networks. Several methods of predicting gene f...

Comparing MicroRNA Target Gene Predictions Related to Alzheimer's Disease Using Online Bioinformatics Tools

Introduction: The prediction of microRNAs related to target genes using bioinformatics tools saves the time and cost of experimental analyses. In the present study, predictions of microRNA target genes relevant to Alzheimer’s disease (AD) were compared with experimentally reported data using different bioinformatics tools. Method: A total of 41 microRNAs associated with 21 essential ge...

Prediction of Blasting Cost in Limestone Mines Using Gene Expression Programming Model and Artificial Neural Networks

The use of blasting cost (BC) prediction to achieve optimal fragmentation is necessary in order to control the adverse consequences of blasting such as fly rock, ground vibration, and air blast in open-pit mines. In this research work, BC is predicted through collecting 146 blasting data from six limestone mines in Iran using the artificial neural networks (ANNs), gene expression programming (G...

IL-23 Receptor Gene rs7517847 and rs1004819 SNPs in Ulcerative Colitis

Background: Crohn’s disease (CD) and ulcerative colitis (UC) are two major clinical presentations of inflammatory bowel disease (IBD). Many novel candidate genes have been found to be associated with increased risk for IBD. Recently, the IL-23 receptor gene was identified as an IBD-associated gene in genome-wide studies. Objective: To ascertain whether rs7517847 and rs1004819 SNPs in the IL-23 recept...

Publication date: 2017